A Set-Covering Approach with Column Generation for Parsimony Haplotyping
Abstract
We introduce an exact algorithm, based on Integer Linear Programming (ILP), for the parsimony haplotyping problem (PHP). The PHP takes molecular data as input and seeks a smallest set of haplotypes that explains a given set of genotypes. Our approach is based on a Set Covering formulation of the problem, solved by branch and bound with both column and row generation. Existing ILP methods for the PHP suffer from the large size of the solution space when the genotypes are long and have many heterozygous sites. Our approach, in contrast, relies on an effective implicit representation of the solution space and solves both real-data and simulated instances that are very hard for other ILPs.
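To make the problem statement concrete, here is a minimal Python sketch of what it means for a set of haplotypes to explain a set of genotypes. It is a brute-force illustration for tiny inputs, not the set-covering or column-generation method of the paper. The encoding (0 and 1 for homozygous sites, 2 for heterozygous sites) and the function names `resolves`, `compatible_haplotypes`, and `smallest_explaining_set` are assumptions made for this example.

```python
from itertools import combinations, product


def resolves(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) explains genotype g.

    Assumed encoding: g[i] in {0, 1} is a homozygous site (both haplotypes
    carry that value); g[i] == 2 is a heterozygous site (the haplotypes differ).
    """
    return all(
        (a != b) if s == 2 else (a == b == s)
        for a, b, s in zip(h1, h2, g)
    )


def compatible_haplotypes(g):
    """All haplotypes that can take part in some resolution of g.

    A genotype with k heterozygous sites has 2**k compatible haplotypes,
    which is why explicit enumeration blows up for long genotypes.
    """
    choices = [(0, 1) if s == 2 else (s,) for s in g]
    return set(product(*choices))


def smallest_explaining_set(genotypes):
    """Brute-force search for a minimum-cardinality explaining set."""
    candidates = sorted(set().union(*(compatible_haplotypes(g) for g in genotypes)))
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            # Every genotype needs at least one resolving pair inside the subset.
            if all(
                any(resolves(h1, h2, g) for h1 in subset for h2 in subset)
                for g in genotypes
            ):
                return list(subset)
    return []


genotypes = [(2, 2, 0), (0, 2, 0), (1, 0, 0)]
solution = smallest_explaining_set(genotypes)
```

In this toy instance the second genotype forces both `(0, 0, 0)` and `(0, 1, 0)` into the set and the third forces `(1, 0, 0)`; these three haplotypes also resolve the first genotype, so three haplotypes suffice. The exponential number of candidates per genotype is the solution-space growth that the abstract says an implicit representation is meant to avoid.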
Similar resources
A Column Generation Approach for Pure Parsimony Haplotyping
Knowledge of the nucleotide chains that compose the double DNA chain of an individual plays a relevant role in detecting diseases and studying populations. However, experimentally determining the single nucleotide chains that, paired, form a certain portion of the DNA is expensive and time-consuming. Mathematical programming approaches have been proposed instead, e.g. formulating the Haplotype ...
Phylogeny- and Parsimony-Based Haplotype Inference with Constraints
Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes based on genotype data. One fast computational haplotyping method is based on an evolutionary model where a perfect phylogenetic tree is sought that explains the observed data. In their CPM’09 paper, Fellows et al. studied an extension of this approach that incorporates prior knowledge in the f...
Phylogeny- and Parsimony-Based Haplotype Inference with Constraints
Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes based on genotype data. One fast computational haplotyping method is based on an evolutionary model where a perfect phylogenetic tree is sought that explains the observed data. An extension of this approach tries to incorporate prior knowledge in the form of a set of candidate haplotypes from w...
Approximation algorithms for the minimum rainbow subgraph problem
Our research was motivated by the pure parsimony haplotyping problem: Given a set G of genotypes, the haplotyping problem consists in finding a set H of haplotypes that explains G. In the pure parsimony haplotyping problem (PPH) we are interested in finding a set H of smallest possible cardinality. The pure parsimony haplotyping problem can be described as a graph colouring problem as follows: ...
Solving haplotyping inference parsimony problem using a new basic polynomial formulation
Similarity and diversity among individuals of the same species are expressed in small DNA variations called Single Nucleotide Polymorphisms (SNPs). Knowledge of the SNP phase gives rise to the haplotyping problem, which in its parsimonious version asks for the minimum number of haplotypes that explain a given set of genotype data. ILP technique represents a good resolution strategy for this interesting co...
Journal title:
- INFORMS Journal on Computing
Volume 21
Publication date: 2009